A Synopsis Based Approach for Itemset Frequency Estimation over Massive Multi-Transaction Stream

Authors

Abstract

Streams in which multiple transactions are associated with the same key are prevalent in practice, e.g., a customer has shopping records arriving at different times. Itemset frequency estimation on such streams is very challenging, since sampling-based methods, such as the popularly used reservoir sampling, cannot be used. In this article, we propose a novel k-Minimum Value (KMV) synopsis method to estimate the frequency of itemsets over multi-transaction streams. First, we extract KMV synopses for each item from the stream. Then, we propose an estimator for the frequency of an itemset over these synopses. Compared with the existing estimator, ours is not only more accurate and more efficient to calculate but also follows the downward-closure property. These properties enable the incorporation of the new estimator into existing frequent itemset mining (FIM) algorithms (e.g., FP-Growth) to mine frequent itemsets over multi-transaction streams. To demonstrate this, we implement a KMV synopsis based FIM algorithm by integrating our estimator into existing FIM algorithms, and we prove that it is capable of guaranteeing accuracy with a bounded synopsis size. Experimental results on massive streams show that our estimator can significantly improve accuracy in both itemset frequency estimation and FIM compared with existing estimators.
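To make the idea of per-item KMV synopses concrete, here is a minimal Python sketch. It uses the textbook KMV distinct-count and intersection estimators, not the estimator proposed in the article, which the abstract does not specify. It also assumes that the frequency of an itemset is the number of distinct keys whose transactions together contain every item in the set. The names `hash01`, `KMVSynopsis`, `build_synopses`, and `estimate_itemset` are hypothetical and chosen only for illustration.

```python
import hashlib


def hash01(key: str) -> float:
    """Map a key deterministically to a pseudo-uniform value in (0, 1]."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / 2**64


class KMVSynopsis:
    """Keeps the k smallest hash values of the distinct keys seen so far."""

    def __init__(self, k: int):
        self.k = k
        self.values = set()

    def add(self, key: str) -> None:
        v = hash01(key)
        if v in self.values:
            return  # repeated key: same hash, so nothing changes
        if len(self.values) < self.k:
            self.values.add(v)
            return
        largest = max(self.values)  # O(k) scan, fine for a sketch
        if v < largest:
            self.values.remove(largest)
            self.values.add(v)

    def estimate(self) -> float:
        """Textbook KMV distinct-count estimate: (k - 1) / k-th smallest hash."""
        if len(self.values) < self.k:
            return float(len(self.values))  # synopsis not full: count is exact
        return (self.k - 1) / max(self.values)


def build_synopses(stream, k: int) -> dict:
    """One synopsis per item; each stores hashes of the keys that bought it."""
    synopses = {}
    for key, items in stream:
        for item in items:
            if item not in synopses:
                synopses[item] = KMVSynopsis(k)
            synopses[item].add(key)
    return synopses


def estimate_itemset(synopses) -> float:
    """Textbook KMV intersection estimate over the given item synopses."""
    k = min(s.k for s in synopses)
    merged = sorted(set().union(*(s.values for s in synopses)))[:k]
    common = sum(1 for v in merged if all(v in s.values for s in synopses))
    if len(merged) < k:
        return float(common)  # every synopsis is complete: count is exact
    return (common / k) * (k - 1) / merged[-1]


# Hypothetical stream: customer c1 and c2 each appear in two transactions.
stream = [
    ("c1", ["milk", "bread"]),
    ("c2", ["milk"]),
    ("c1", ["eggs"]),
    ("c3", ["bread", "milk"]),
    ("c2", ["bread"]),
]
syn = build_synopses(stream, k=64)
print(estimate_itemset([syn["milk"], syn["bread"]]))
print(estimate_itemset([syn["milk"], syn["eggs"]]))
```

Because each key always hashes to the same value, a customer who appears in several transactions is stored at most once per item, which is why a synopsis of this kind can cope with repeated keys where sampling individual transactions cannot. While a synopsis holds fewer than k values the counts are exact; once it is full, the result is an estimate whose error shrinks as k grows.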

Similar resources

Frequent Itemset Mining over Stream Data: Overview

During the past decade, stream data mining has attracted widespread attention from experts and researchers all over the world, and a large number of interesting research results have been achieved. Among them, frequent itemset mining is one of the main research branches of stream data mining and holds a fundamental and significant position. In order to further advance and develop the researc...

Semi-Blind Channel Estimation based on subspace modeling for Multi-user Massive MIMO system

Channel estimation is an essential task to fully exploit the advantages of massive MIMO systems. In this paper, we propose a semi-blind downlink channel estimation method for massive MIMO systems. We suggest a new model for the channel matrix subspace. Based on the low-rank property, we propose an algorithm to estimate the channel matrix subspace. In the next step, using o...

High Utility Rare Itemset Mining over Transaction Databases

High-Utility Rare Itemset (HURI) mining finds itemsets in a database whose utility is no less than a given minimum utility threshold and whose support is less than a given frequency threshold. Identifying high-utility rare itemsets in a database can help in better business decision making by highlighting the rare itemsets which give high profits so that they can be marketed more t...

A hybrid approach for database intrusion detection at transaction and inter-transaction levels

Nowadays, information plays an important role in organizations. Sensitive information is often stored in databases. Traditional mechanisms such as encryption, access control, and authentication cannot provide a high level of confidence. Therefore, the existence of Intrusion Detection Systems in databases is necessary. In this paper, we propose an intrusion detection system for detecting attacks...

A Novel Utility and Frequency Based Itemset Mining Approach for Improving CRM in Retail Business

The paradigm shift from ‘data-centered pattern mining’ to ‘domain driven actionable knowledge discovery’ has increased the need for considering the business yield (utility) and demand or rate of recurrence of the items (frequency) while mining a retail business transaction database. Such a data mining process will help in mining different types of itemsets of varying business utility and demand...

Journal

Journal title: ACM Transactions on Knowledge Discovery From Data

Year: 2021

ISSN: 1556-472X, 1556-4681

DOI: https://doi.org/10.1145/3465238